ABSTRACT We report a general mass spectrometric
approach for the rapid identification and characterization of
proteins isolated by preparative two-dimensional polyacrylamide
gel electrophoresis. This method possesses the inherent
power to detect and structurally characterize covalent
modifications. Absolute sensitivities of matrix-assisted laser
desorption ionization and high-energy collision-induced
dissociation tandem mass spectrometry are exploited to determine
the mass and sequence of subpicomole sample quantities of
tryptic peptides. These data permit mass matching and
sequence homology searching of computerized peptide mass and
protein sequence data bases for known proteins and design of
oligonucleotide probes for cloning unknown proteins. We have
identified 11 proteins in lysates of human A375 melanoma
cells, including: α-enolase, cytokeratin, stathmin, protein
disulfide isomerase, tropomyosin, Cu/Zn superoxide
dismutase, nucleoside diphosphate kinase A, galaptin, and
triosephosphate isomerase. We have characterized several
posttranslational modifications and chemical modifications that
may result from electrophoresis or subsequent sample
processing steps. Detection of comigrating and covalently
modified proteins illustrates the necessity of peptide sequencing
and the advantages of tandem mass spectrometry to reliably
and unambiguously establish the identity of each protein. This
technology paves the way for studies of cell-type dependent
gene expression and studies of large suites of cellular proteins
with unprecedented speed and rigor to provide information
complementary to the ongoing Human Genome Project.



Cloning of genes associated with malignancy generates inevitable
excitement in biology and medicine. However, subsequent
study at the protein level is clearly necessary to understand
the processes by which those genes affect vital biological
functions, such as the control of gene expression and regulation
of cell-signaling pathways. Two-dimensional (2D) PAGE
is preferred for simultaneous separation and visualization of
proteins present in cell lysates (1, 2). Protein identification and
characterization of functionally important primary structural
features are the first steps toward gaining insight into the
biological roles of specific proteins (3-6). Furthermore,
characterization of the 2D-PAGE map from a cell system facilitates
both basic research and clinical diagnosis (3, 7).

Traditional partial-sequencing approaches for identifying
proteins isolated by 2D PAGE, such as Edman degradation of
electroblotted proteins, often meet with limited success due to
N-terminal blockage of many eukaryotic proteins. While peptide
cleavage, extraction, and HPLC separation may enable
subsequent Edman sequencing of internal peptides (8), these
procedures alone are not rapid enough to identify the thousands
of proteins in a cell lysate in a timely manner. Because
of their speed, sensitivity, and ability to deal directly with
mixtures, several recently developed mass spectrometry
techniques are rapidly becoming the primary methods for
identifying proteins isolated by 2D PAGE (4-6, 9, 10). Previously,
our laboratories reported the advantages of using liquid
secondary ion mass spectrometry, high-energy collision-induced
dissociation (CID) tandem mass spectrometry, Edman
sequencing, and 2D PAGE to characterize lipocortin I from
human melanoma lysates (4). Peptide mass determination and
CID sequencing using sample quantities of ≈100 pmol
revealed an acetylated N terminus and an unanticipated
acrylamide-modified cysteine.

Recently, we have substantially reduced the detection limits
of peptide sequencing through incorporation of continuous
flow sample introduction and a scanning charge-coupled
device array detector onto our tandem mass spectrometer. The
resulting chemical noise reduction and rapid recording of
single CID spectra (≈10 s) now enable routine peptide
sequencing from sample quantities of 100 fmol to 10 pmol (11,
12). Prior to CID, substantial sample is conserved by exploiting
the 1-100 fmol sensitivity of matrix-assisted laser desorption
ionization (MALDI), rather than liquid secondary ion mass
spectrometry, for preliminary mass measurement. MALDI
enables the generation of peptide molecular-mass maps from
fmol levels of unfractionated protein digests or individual
HPLC fractions. These peptide-mass fingerprints may be used
to search peptide-mass data bases and predict protein
identities without resorting to sequencing (5, 9, 13). However,
mass-matching algorithms can find only proteins that are
already present in a data base. Primary structure
elucidation is therefore essential to identify unknown proteins,
search sequence data bases for homologous proteins, characterize
covalent modifications, or design oligonucleotide probes for
gene cloning. Taking these points into consideration, we report
an integrated strategy for identifying and characterizing
primary structural features of proteins isolated by 2D PAGE.

MATERIALS AND METHODS 
Protein Isolation, Purification, Digestion, and Edman
Sequencing. Preparation of human A375 melanoma lysates,
isolation of proteins by 2D PAGE (4, 14), protein electroelution
of pooled gel spots, tryptic digestion, HPLC separation,
and Edman sequencing were performed as described (4). The
procedure of Rosenfeld et al. (15) was used to produce an in-gel
tryptic digest of spot 42. Control digestions without protein
substrate were performed so that trypsin autoproteolysis
products could be disregarded.

Mass Spectrometry. Molecular masses (isotopic average) of
all tryptic peptides were determined by analyzing 1/50th of
each HPLC fraction with a VG TofSpec MALDI mass
spectrometer equipped with a nitrogen laser and operated in the
linear mode. Peptides were crystallized in matrices consisting
of 100 mM 2,4-dihydroxybenzoic acid/50 mM fucose or a
saturated solution of α-cyano-4-hydroxycinnamic acid
prepared in 0.1% aqueous trifluoroacetic acid (TFA). The mean
mass from all spectra recorded for a particular peptide is
reported. All MALDI spectra were externally calibrated by
using a standard peptide mixture. High-energy positive ion
CID mass spectra were acquired on a Kratos Analytical
Instruments Concept IIHH four-sector tandem mass
spectrometer equipped with a continuous flow, liquid inlet probe
and a scanning charge-coupled device array detector (11).
Both MS1 and MS2 were operated at 1000 resolution (M/ΔM)
to determine monoisotopic masses. HPLC fractions were
concentrated to ≈5 µl and diluted to ≈15 µl with a mixture of
aqueous 5% (vol/vol) thioglycerol/5% (vol/vol) acetonitrile/
0.1% TFA matrix solution. Samples were introduced into the
mass spectrometer source at a flow rate of 3 µl/min. Sequences
were deduced from CID spectra by using interactive
interpretation software developed in our laboratory (16). Quantitation
estimates were based on CID and UV response of standards.

Immunoblot Analysis. 2D immunoblot analysis was per- 
formed as described (4) with modifications. Gels were loaded 
with 400-800 µg of protein lysate and transferred at 1.3
mA/cm² for 2 h, and blots were stained with 0.1% Ponceau
S/5% (vol/vol) acetic acid. Stain was removed with 100 mM
NaOH prior to antibody staining (data not shown). 

Data Base Searching. The OWL protein sequence data base 
(17) was searched by using BLAST (18) with peptide sequences
obtained by high-energy CID analysis or Edman degradation.
Masses obtained by MALDI were used by MOWSE (internet
e-mail server version 5.1; mowse@dl.ac.uk) to search a peptide
mass data base constructed from a theoretical trypsin digest of
the entire OWL data base (9). Typical parameters employed were
a \5% gel-derived protein-mass tolerance, a 2- or 3-Da peptide- 
mass tolerance, and a partial cleavage scoring factor of 0.4. 
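
The mass-matching idea can be illustrated with a minimal Python sketch. It is not the MOWSE scoring scheme: it simply digests a candidate sequence in silico (cleavage after Lys/Arg, but not before Pro, with optional partial cleavages) and counts the observed masses that fall within a fixed tolerance of any theoretical peptide mass. The function names, the 2-Da default tolerance, and the use of neutral average masses are assumptions made for illustration only.

```python
# Illustrative sketch only: simple tolerance-based matching, not MOWSE scoring.

# Average (isotopic-average) amino acid residue masses in Da.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_AVG = 18.0153  # one water per free peptide


def tryptic_digest(sequence, max_missed=1):
    """Return tryptic peptides, allowing up to max_missed partial cleavages."""
    # Cut after K or R unless the next residue is P.
    cut_sites = [i + 1 for i, aa in enumerate(sequence[:-1])
                 if aa in "KR" and sequence[i + 1] != "P"]
    bounds = [0] + cut_sites + [len(sequence)]
    peptides = []
    for i in range(len(bounds) - 1):
        last = min(i + 1 + max_missed, len(bounds) - 1)
        for j in range(i + 1, last + 1):
            peptides.append(sequence[bounds[i]:bounds[j]])
    return peptides


def average_mass(peptide):
    """Neutral average mass of an unmodified peptide in Da."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in peptide) + WATER_AVG


def matched_fraction(observed_masses, sequence, tol_da=2.0, max_missed=1):
    """Fraction of observed masses within tol_da of any theoretical peptide."""
    if not observed_masses:
        return 0.0
    theoretical = [average_mass(p) for p in tryptic_digest(sequence, max_missed)]
    hits = sum(
        any(abs(obs - theo) <= tol_da for theo in theoretical)
        for obs in observed_masses
    )
    return hits / len(observed_masses)
```

A data-base search would repeat `matched_fraction` over every candidate sequence and rank the results; a real scoring scheme would also weight matches, for example by how often a given peptide mass occurs in the data base.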

RESULTS AND DISCUSSION 

Fig. 1 illustrates a representative 2D preparative gel containing
human A375 melanoma proteins from whole cell lysates. The
spots selected for mass spectrometric analysis were chosen
because of their abundance, reproducibility, and relative isolation.
Our strategy for the identification and characterization of these
proteins is outlined in Fig. 2. Protein spots from several
preparative gels were excised, and spots of identical mass/pI were pooled
before purification from the gel matrix, digestion with trypsin,
and peptide separation by HPLC. The yield and number of
peptides recovered from each spot were not always proportional
to the number of gel plugs excised or the amount and size of
protein present, suggesting differential digestion susceptibility,
sample handling loss, and occasional protein comigration. For
each spot, sub-pmol aliquots of HPLC fractions were analyzed by
MALDI. The list of masses obtained from each protein digest
served as the focal point for guiding subsequent experiments for
identifying and characterizing primary structural features in the
protein(s). The number of peptides recovered from digestion of
a protein in a 2D PAGE spot varied and depended on protein size
and the occasional comigration of other proteins. Experimentally
determined masses were used with MOWSE data-base searching
to match theoretical peptide masses and attempt to predict
protein identities. HPLC fractions containing peptide masses
below 2 kDa were subjected to high-energy CID for peptide
sequencing. Protein identities were established by using the
determined sequences to search the OWL protein sequence data
base by using BLAST. Edman degradation was used to sequence
peptides with masses greater than ≈2 kDa, since larger peptides
ionize less efficiently by liquid secondary ion mass spectrometry.

Fig. 1. Coomassie blue G-250-stained 2D preparative gel of
human A375 melanoma proteins, numbered as in Tables 1 and 2. One
milligram of total protein was loaded. The amount of protein
estimated by densitometry in designated spots varied from 0.6 to 2.9

Tables 1 and 2 summarize peptide mass data, sequences
determined or attributed by mass, and data-base search results for all
spots studied. Ideally, we seek to attribute every mass to a unique
sequence either by sequencing each peptide or matching it by
mass to a peptide from the characterized protein.

From the 39 peptide masses from spot 3 listed in Table 1,
MOWSE predicted the protein identity as α-enolase. Although
most cell types demonstrate α-enolase activity, distribution of the
specific enolase isoenzymes in biologically active dimers may be
tissue specific. Only the α- and γ-homodimers can be detected in
serum from patients with malignant ocular melanoma (19).
During sequence analysis on selected peptides from spot 3, we
also found a cytokeratin sequence, LASYLDK, that is present in
many type I cytoskeletal keratins (mass, pI, and peptide masses
best matched cytokeratin 15). This indicated that at least two
proteins had comigrated. Immunoblot analysis confirmed that
α-enolase is present. Furthermore, the presence of a cytokeratin
is consistent with the localization of many keratin subtypes to this
gel region (20). However, the presence of cytokeratin was not
predicted by MOWSE when masses exclusive of those
attributable to α-enolase were used.

Fig. 2. Strategy for identifying and characterizing proteins
separated by 2D PAGE. ESIMS, electrospray ionization mass spectrometry.
[Legible flowchart steps: remove protein(s) from gel spot by
electro-elution, in-gel digestion, or electro-blotting; determine
intact masses of protein(s) by ESI or MALDI MS; design
oligonucleotides for cloning.]

Protein disulfide isomerase (PDI) and stathmin were found in
spots 30 and 24, respectively, and MOWSE readily predicted
both. Sequencing of several peptides by high-energy CID
established these identities. PDI, which catalyzes formation and
interconversion of disulfide bonds in the endoplasmic reticulum, has
been implicated in the activation of interferon-inducible genes in
chronic myelogenous leukemia cells (21). Immunoblot analysis
confirmed the presence of PDI in spot 30 and showed stathmin
presence in spot 24 and in two nearby (more acidic) spots.
Serine-phosphorylated forms of stathmin have been found in
T-lymphocytes (22). The ability to suggest additional
posttranslationally modified or genetically variant isoforms of a protein was
one motivation for incorporating immunoblot analysis into our
strategy.

Table 1. Summary of data obtained for nine human melanoma proteins from eight 2D PAGE gel spots

Table 2. Summary of data obtained for triosephosphate isomerase from 2D PAGE gel spots 37 and 42

*The difference between the measured mass and calculated mass (average isotopic).
†Abbreviations: ( ), residues before/after peptide; Cam, acrylamide-modified Cys; Camo, Cam (oxidized); acX, acetylated N terminus; cX, carbamylated N terminus; Mso, Met sulfoxide.
‡Sequenced by Edman degradation.
§Masses neither identified nor attributed.
¶Masses ambiguous; may be that of cytokeratin.
‖Inconclusive search result.

In spot 33, fibroblast nonmuscle tropomyosin was identified
following high-energy CID. Similarly, another form of
tropomyosin, cytoskeletal type, was identified in spot 34. The two
protein sequences are 16% identical. High-energy CID showed
that the masses of the N-terminal peptides from both proteins
differed by only 1 Da, due to an internal sequence difference
of ITTI (spot 34) vs. LNSL (spot 33). The presence of
N-terminal acetylation is evident in Fig. 3 from the 42-Da
increase in the mass of all N-terminally derived a- and b-type
ions. Leucine at residues 3 and 6 is evident in Fig. 3 from the
mass values of the w6 (656.3 Da) and w9 (970.4 Da) ions
resulting from side-chain fragmentation. Isoleucine was
likewise assigned at those same positions in the corresponding
peptide from spot 34 (data not shown).
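
The two mass relationships described here can be checked with a short calculation. The sketch below uses standard monoisotopic residue masses and assumes the Fig. 3 peptide is the N-terminally acetylated sequence AGLNSLEAVKR; the function names are ours, chosen for illustration. It computes the protonated monoisotopic mass of that peptide and the residue-mass difference between the ITTI and LNSL segments.

```python
# Monoisotopic amino acid residue masses in Da (standard values).
MONO_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free peptide termini
PROTON = 1.00728   # charge carrier for the singly protonated ion
ACETYL = 42.01056  # C2H2O gained on N-terminal acetylation


def residue_sum(sequence):
    """Sum of monoisotopic residue masses for a sequence segment."""
    return sum(MONO_RESIDUE_MASS[aa] for aa in sequence)


def mh_plus(sequence, n_acetyl=False):
    """Monoisotopic mass of the singly protonated peptide, MH+."""
    mass = residue_sum(sequence) + WATER + PROTON
    if n_acetyl:
        mass += ACETYL
    return mass


print(round(mh_plus("AGLNSLEAVKR", n_acetyl=True), 1))      # 1199.7
print(round(residue_sum("ITTI") - residue_sum("LNSL"), 2))  # 1.02
```

The first value agrees with the MH+ of 1199.7 given in the Fig. 3 legend, and the second reproduces the 1-Da difference between the two N-terminal peptides. Because Leu and Ile have identical residue masses, these totals cannot distinguish them; that assignment rests on the side-chain fragment ions.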

Differentiation and unambiguous structure assignment of
these similar peptides illustrates the power of high-energy CID in
protein primary structure determination. Methodology and
fragment-ion nomenclature for high-energy CID are more
thoroughly described elsewhere (12, 23). Although their biological
function in nonmuscle cells is not clear, the tropomyosins are
multiple isoforms which complex with microfilaments (24). While
spot 34 was readily predicted as cytoskeletal tropomyosin by
MOWSE, fibroblast nonmuscle tropomyosin was only one
isoform among several ambiguous possibilities predicted for spot 33.
Immunoblot analysis confirmed that tropomyosin is present in
spots 33 and 34 and two more basic spots nearby.

Nucleoside diphosphate kinase A (NDPK-A) was identified
in spot 36 following high-energy CID. NDPK catalyzes the
phosphorylation of nucleoside 5' diphosphates. NDPK-A is
the product of the nm23 gene, which, when overexpressed,
decreases tumorigenesis of melanoma cell lines (25). MOWSE
results for this spot were inconclusive, due in part to chemical
modification of two peptides.

Cu/Zn superoxide dismutase (Cu/Zn SOD) and galaptin were
identified in spots 35 and 41, respectively, following high-energy
CID. Immunoblot analysis confirmed the presence of both
Cu/Zn SOD and galaptin. Cu/Zn SOD catalyzes the conversion
of toxic superoxide to hydrogen peroxide. Galaptin, a lectin
present in many tissues, has been found in several skin tumor
types including melanoma, and its reduced expression level has
been suggested as a means of diagnosing malignancy (26).
MOWSE was unable to predict the identity of either of these
proteins from the obtained MALDI data. Only 15-25% of the
MALDI masses obtained for spots 35 and 41 were consistent with
unmodified tryptic peptides. Our studies show that for any given
MOWSE search, numerous proteins present in the data base
appear to be randomly capable of matching ≈30% of the masses
in a given list. Below this level we have little confidence in
MOWSE predictions and find sequence determination necessary
for protein identification.
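
A rough feel for why chance matches reach this level can be had from a simple model, offered here only as an illustration and not as an analysis performed in this work. Assume the theoretical peptide masses of an unrelated protein are spread uniformly over a fixed mass window; the 500-3000 Da window, the function name, and the peptide counts below are all hypothetical. The probability that a given observed mass lies within the tolerance of at least one theoretical mass is then:

```python
def random_match_fraction(n_theoretical, tol_da, mass_lo=500.0, mass_hi=3000.0):
    """Expected fraction of observed masses matched by chance.

    Assumes n_theoretical peptide masses drawn uniformly and independently
    from [mass_lo, mass_hi]; each covers a window of +/- tol_da.
    """
    window = 2.0 * tol_da / (mass_hi - mass_lo)
    return 1.0 - (1.0 - window) ** n_theoretical


# Hypothetical peptide counts, including partial-cleavage products.
for n in (40, 100, 180):
    print(n, round(random_match_fraction(n, tol_da=2.5), 2))
```

Under these assumptions, a large protein yielding on the order of 180 theoretical peptides would match roughly 30% of an arbitrary mass list at a ±2.5-Da tolerance, while a small protein would match far less. Real peptide masses are not uniformly distributed, so the numbers are indicative only.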

Triosephosphate isomerase (TPI), present in both spot 37 and
spot 42, was readily predicted by MOWSE (Table 2). The success
of MOWSE is noteworthy despite the fact that several of the
MALDI masses are quite inaccurate (an error of more than ±3.5
Da, attributed to an inconsistently performing laser power
supply). Subsequent high-energy CID and Edman degradation
elucidated the sequences of several covalently modified peptides and
established the identity of TPI in both spots. Peptides that
represented a nonspecific trypsin cleavage between His-95 and
Ser-96 and chemical modifications, including
acrylamide-modified cysteines, oxidation, and carbamylation, were found.
The chemical modification of cysteine arose from protein
interaction with the gel matrix. Carbamylation is likely a consequence
of using 2 M urea buffer during enzymatic digestion of spot 37.
In-gel digestion with 100 mM ammonium bicarbonate was done
for spot 42 rather than electroelution, thus eliminating the need
for urea.
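
Attributing an unmatched mass to a modified peptide amounts to testing it against each theoretical mass plus a modification increment. The sketch below is a hypothetical helper, not the procedure used in this work; it uses approximate average-mass increments for the modifications named above, and the peptide label and masses in the usage lines are invented for illustration.

```python
# Approximate average-mass increments in Da for the modifications discussed.
MOD_DELTAS = {
    "N-terminal acetylation": 42.04,   # + C2H2O
    "carbamylation": 43.02,            # + CHNO
    "oxidation": 16.00,                # + O
    "acrylamide adduct on Cys": 71.08, # + C3H5NO
}


def candidate_assignments(observed, unmodified, tol_da=2.0):
    """List (peptide, modification) pairs consistent with an observed mass.

    unmodified maps a peptide label to its unmodified average mass in Da.
    Only zero or one modification per peptide is considered.
    """
    hits = []
    for peptide, mass in unmodified.items():
        if abs(observed - mass) <= tol_da:
            hits.append((peptide, "unmodified"))
        for name, delta in MOD_DELTAS.items():
            if abs(observed - (mass + delta)) <= tol_da:
                hits.append((peptide, name))
    return hits


# Hypothetical peptide and masses, for illustration only.
print(candidate_assignments(1042.5, {"PEP1": 1000.0}))
```

With a 2-Da tolerance the example returns both acetylation and carbamylation for the same peptide, since the two increments differ by only about 1 Da. Mass alone cannot separate them at this accuracy, which is one reason sequencing of the modified peptides was required.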

TPI, which catalyzes the interconversion of dihydroxyacetone
phosphate and glyceraldehyde 3-phosphate in glycolysis,
gluconeogenesis, fatty acid synthesis, and the pentose shunt, is
the product of a single gene. However, multiple
electrophoretic forms have been observed, and some result from
deamidations at Asn-71 and Asn-15 or oxidation of Cys-126
(27). By high-energy CID, an Asn-71-containing peptide in
spot 37 (mass 1620.9 in Table 2) was not deamidated. The
corresponding peptide was not found in spot 42. All five
cysteines in TPI were shown by high-energy CID, Edman
degradation, or mass matching to be consistent with
acrylamide modification for spots 37 and 42. Throughout this work
we have found cysteines exclusively in the acrylamide-modified
form, despite inclusion of the antioxidant sodium thioglycolate
during electrophoresis.

Fig. 3. High-energy CID spectrum of acetylated N-terminal
tryptic peptide of nonmuscle tropomyosin (1-10 pmol) from spot 33. MH+
= 1199.7 (monoisotopic mass). Sequence: CH3CO-AGLNSLEAVKR.
Peptide backbone cleavage ions associated with charge retention at the
C terminus are denoted by x, y, and z; and at the N terminus by a and
b. Side chain fragment ions are denoted by d and w.

We have not currently pursued the strategy outlined in Fig.
2 which uses mass spectrometry to determine accurate masses
on intact proteins isolated by 2D PAGE. For proteins in
solution, mass determinations accurate to 0.01% (±2 Da on a
20-kDa protein) may be obtainable by MALDI or electrospray
ionization mass spectrometry (28). Accurate intact masses
could suggest the existence and possible identity of
posttranslational modifications or confirm an expected protein
sequence. However, mass determination on intact proteins
isolated by 2D PAGE is confounded by problems associated with
removal of stained protein from the gel matrix in a form
amenable to mass spectrometry. Initial attempts at MALDI
from proteins and peptides blotted onto poly(vinylidene
difluoride) and nylon membranes are encouraging (29-31).
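
As a worked check of the accuracy figure quoted for intact proteins in solution (the symbols $\Delta m$ and $m$ are introduced here for illustration), a relative accuracy of 0.01% applied to a 20-kDa protein gives

```latex
\[
  \Delta m \;=\; 0.01\% \times m
           \;=\; 10^{-4} \times 20\,000\ \mathrm{Da}
           \;=\; 2\ \mathrm{Da}.
\]
```

An uncertainty of this size is much smaller than the increments of the modifications discussed in this work, such as acetylation (42 Da) or an acrylamide adduct on cysteine (71 Da), so such modifications could in principle be suggested from an intact mass of that accuracy.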

Since our strategy is based on obtaining sequence
information at fmol to low pmol levels, discovery of previously
unknown peptide sequences will enable design of
oligonucleotide probes for gene cloning. Furthermore, the growth of
sequence data bases fueled by the Human Genome Project will
increasingly reduce the chances of discovering unknown
proteins. However, the discriminating power of a peptide
mass-matching strategy employing ±2-3 Da MALDI mass data will
progressively decline with data-base growth, thus producing
increasing numbers of inconclusive results, necessitating
sequencing. Enhanced mass accuracy obtained by incorporating
electrospray ionization mass spectrometry or improvements in
MALDI technology would slow the anticipated decline in
discriminating power of mass-matching strategies by reducing
the effective portion of the data base considered in a search.
Hence, when protein recognition is sought without need for
primary structure characterization, the identities of many
known proteins may be rapidly predicted by combining mass
matching with direct MALDI analysis of unfractionated
digests (5, 6, 9, 10). Unfortunately, the ability to directly perform
substantial sequence analysis is severely compromised. Of the
11 proteins we characterized from 10 gel spots, 6 were readily
predicted by mass matching with MOWSE, and 9 contained
unmatched covalently modified peptides. Consequently, our
results illustrate the necessity of peptide sequencing and the
advantages of tandem mass spectrometry to rapidly and
unambiguously identify proteins isolated by 2D PAGE.